Advances in fast multistream diarization based on the information bottleneck framework

Authors

  • Deepu Vijayasenan
  • Fabio Valente
  • Hervé Bourlard
Abstract

Multistream diarization is an effective way to improve diarization performance, with MFCC and Time Delay of Arrival (TDOA) being the most commonly used features. This paper extends our previous work on information bottleneck (IB) diarization, aiming to include a large number of features besides MFCC and TDOA while keeping computational costs low. First, HMM/GMM and IB systems are compared using two and four feature streams, and an error analysis is performed. Results on a dataset of 17 meetings show that, in spite of comparable oracle performance, the IB system is more robust to feature weight variations. A sequential optimization is then introduced that further improves the speaker error by 5-8% relative. Finally, computational issues are discussed. The proposed approach is significantly faster, and its complexity grows only marginally with the number of feature streams: it runs in 0.75 real time even with four streams, achieving a speaker error of 6%.
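
The two figures quoted at the end of the abstract are ratios, and a short arithmetic sketch may make them concrete. The snippet below assumes the usual definition of real-time factor (processing time divided by audio duration) and reads "relative" as a proportional reduction of the error rate. The function names, the 60-minute meeting, and the 6.5% starting error are hypothetical values chosen for illustration, not results from the paper.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration (assumed definition)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def processing_minutes(audio_minutes: float, rtf: float) -> float:
    """Time needed to process a recording at a given real-time factor."""
    return audio_minutes * rtf


def apply_relative_reduction(error_percent: float, relative_percent: float) -> float:
    """Error rate after reducing it by `relative_percent` of its own value."""
    return error_percent * (1.0 - relative_percent / 100.0)


if __name__ == "__main__":
    # Hypothetical 60-minute meeting processed at the 0.75 real time
    # quoted in the abstract.
    print(f"processing time: {processing_minutes(60.0, 0.75):.1f} min")

    # Hypothetical starting speaker error of 6.5% (not a figure from the
    # paper), reduced by the 5% and 8% relative bounds from the abstract.
    for rel in (5.0, 8.0):
        reduced = apply_relative_reduction(6.5, rel)
        print(f"{rel:.0f}% relative reduction of 6.5% gives {reduced:.3f}%")
```

Under these assumptions, a real-time factor below 1 means the system finishes faster than the recording plays, and a relative improvement of a few percent changes an error rate of this size by only a fraction of an absolute point.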

Similar resources

Integration of TDOA features in information bottleneck framework for fast speaker diarization

In this paper we address the combination of multiple feature streams in a fast speaker diarization system for meeting recordings. Whenever Multiple Distant Microphones (MDM) are used, it is possible to estimate the Time Delay of Arrival (TDOA) for different channels. In [9], it is shown that TDOA can be used as additional features together with conventional spectral features for improving speak...

Multistream Diarization Fusion Using the Minimum Variance Bayesian Information Criterion

Speaker diarization is necessary with ubiquitous and individualized recorders. We focus on the specific task of speaker diarization from two information streams, two microphones, assigned to two participants of interest. In real scenarios, speakers may be co-located, in noisy environments with interfering speakers. Multistream diarization can exploit additional information and diarization fusio...

Speaker diarization of spontaneous meeting room conversations

Speaker diarization is the task of identifying “who spoke when” in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostic studies on state-of-the-art diarization systems have isolated three main issues with the systems: overlapping speech, effects of background noise and speech/nonspeech detection errors on...

Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings Recordings

Improved diarization results can be obtained through combination of multiple systems. Several combination techniques have been proposed based on output voting, initialization and also integrated approaches. This paper proposes and investigates a novel approach to combine diarization systems through the use of features. A first diarization system, based on the Information Bottleneck, is used to ...

Integrating online i-vector extractor with information bottleneck based speaker diarization system

Conventional approaches to speaker diarization use short-term features such as Mel Frequency Cepstral Coefficients (MFCC). Features such as i-vectors have been used on longer segments (minimum 2.5 seconds of speech). Using i-vectors for speaker diarization has been shown to be beneficial as it models speaker information explicitly. In this paper, the i-vector modelling technique is adapted to ...

Publication date: 2010